Exploring Entity-Centric Methods in the UK Government Web Archive
Abstract
Being able to explore large digital collections effectively is of interest to academics and practitioners alike. The need to go beyond keyword-driven functionality to features that support exploration and discovery is widely recognised. In addition, providers are seeking to support more diverse groups of users with varying information needs and tasks. Increasing amounts of cultural heritage are being stored in web archives, which present unique challenges as a form of digital cultural heritage. This paper describes a collaboration between the University of Sheffield and the UK National Archives to investigate entity-based methods for exploring the UK Government Web Archive.
Similar resources
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
Contextualizing Obesity and Diabetes Policy: Exploring a Nested Statistical and Constructivist Approach at the Cross-National and Subnational Government Level in the United States and Brazil
Background: This article conducts a comparative national and subnational government analysis of the political, economic, and ideational constructivist contextual factors facilitating the adoption of obesity and diabetes policy. Methods: We adopt a nested analytical approach to policy analysis, which combines cross-national statistical analysis with subnational case study comparisons to examine...
Into the Dark Domain: The UK Web Archive as a Source for the Contemporary History of Public Health
With the migration of the written record from paper to digital format, archivists and historians must urgently consider how web content should be conserved, retrieved and analysed. The British Library has recently acquired a large number of UK domain websites, captured 1996-2010, which is colloquially termed the Dark Domain Archive while technical issues surrounding user access are resolved. Th...
Multi-aspect Entity-Centric Analysis of Big Social Media Archives
Social media archives serve as important historical information sources, and thus meaningful analysis and exploration methods are of immense value for historians, sociologists and other interested parties. In this paper, we propose an entity-centric approach to analyze social media archives and we define measures that allow studying how entities are reflected in social media in different time p...